Reparametrizing the loop entropy weights: effect on DNA melting curves 
Recent advances in the understanding of the melting behavior of double-stranded DNA with
statistical mechanics methods lead to improved estimates of the weight factors for the dissociation 
events of the chains, in particular for interior loop melting. So far, in the modeling of DNA melting, 
the entropy of denaturated loops has been estimated from the number of configurations of a closed 
self-avoiding walk. It is well understood now that a loop embedded in a chain is characterized by a 
loop closure exponent c which is higher than that of an isolated loop. Here we report an analysis of 
DNA melting curves for sequences of a broad range of lengths (from 10 to 10^6 base pairs) calculated
with a program based on the algorithms underlying MELTSIM. Using the embedded loop exponent 
we find that the cooperativity parameter is one order of magnitude bigger than current estimates. 
We argue that in the melting region the double helix persistence length is greatly reduced compared 
to its room temperature value, so that the use of the embedded loop closure exponent for real DNA 
sequences is justified. 

PACS numbers: 87.14.Gg, 87.15.He, 05.70.Fh, 64.10.+h
I. INTRODUCTION

A standard method to unbind the two strands of a 
double-stranded DNA (dsDNA) in solution is that of in- 
creasing the temperature of the system. This process, 
known as DNA thermal denaturation or DNA melting, 
has been studied since the sixties as it provides important 
information about the interaction between base pairs, the 
stability of the double helix, the effect of the solvent and 
of the salt concentration (for a review see Ref. [1]).

Several techniques are currently available for the in- 
vestigation of DNA thermal denaturation such as UV 
absorption, differential scanning calorimetry, circular 
dichroism, NMR, fluorescence emission and temperature 
gradient gel electrophoresis. Perhaps the most established
of these is the UV absorption method, which consists
in irradiating the sample with UV light at 270 nm,
a wavelength which is preferentially absorbed by
single-stranded DNA (ssDNA). The total fraction of absorbed
light, A, is therefore simply proportional to the fraction of
dissociated base pairs of the sequence and provides a direct
measurement of the order parameter of the problem.
In experimental studies, rather than analyzing A directly,
it is customary to consider its temperature derivative:
the plot of dA/dT vs. T is known as the differential melting
curve, but we will simply refer to it, in the rest of
the paper, as the melting curve.

Melting curves can show a single or several peaks (typical
examples, as observed in experiments, are shown in
Fig. 1), whose positions and heights depend on the sequence
length and composition as well as on external
parameters such as the salt concentration. For very short
sequences, i.e. of about 10^1 - 10^2 base pairs (bps), the
melting curve shows a single peak indicating a sudden
unbinding of the two strands [see Fig. 1(a)]. The peak
is rounded by finite size effects and thus can become quite
broad for very short chains. Somewhat longer sequences
(≈ 10^3 - 10^4 bps) are instead characterized by several
FIG. 1: Schematic view of possible differential melting curves
observed for (a) short (~ 10^ bps), (b) intermediate (~ 10^3 -
10^4 bps) and (c) long (> 10^ bps) DNA sequences.



peaks of typical width of about 0.5°C or less [Fig. 1(b)].
These peaks are the signatures of sharp transitions of
cooperatively melting regions, as for instance inner loop
openings or the unbinding of double-stranded regions
at the edges of the chain. Finally, in very long chains
(≈ 10^6 bps) there is again only a single broad peak covering
about 15-20°C, which is actually the superposition
of many distinct peaks associated with the denaturation
of single domains [Fig. 1(c)]. These single peaks cannot
be resolved anymore for such long sequences and the
melting curve becomes again rather featureless.

The computational prediction of DNA melting curves
is a basic bioinformatics task which is needed for a large
variety of applications such as primer design, DNA control
during Polymerase Chain Reaction or mutation analysis.
Also the denaturation behavior of DNA sequences
of genomic size is of interest for studies of sequence
complexity and evolution and for gene identification
and mutation. Consequently, specialized
tools, such as Meltsim, were developed for this purpose.

The first attempts to model DNA denaturation
date back to the sixties (for a recent review see
Ref. 0). The simplest model employed for this purpose
was the one-dimensional Ising model where the two
spin states s_i = 0, 1 represent the open (s_i = 0) or
closed (s_i = 1) configuration for complementary base
pairs. This model obviously does not provide a
real thermodynamic phase transition but only a smooth
crossover between the closed (dsDNA) and open (ssDNA)
regimes. The inclusion of an entropic term, which takes
into account the number of possible configurations of the
denaturated loops, induces an effective long range interaction
in the system and thus a genuine phase transition
may occur, as shown long ago by Poland and Scheraga.
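As a minimal sketch of why a nearest-neighbor description alone gives only a crossover, the following Python fragment evaluates the closed fraction of an infinite chain from a 2 x 2 transfer matrix. The Zimm-Bragg form of the matrix, the function names and the numerical values are assumptions made for illustration and are not taken from the works cited above. For any σ > 0 the closed fraction is a smooth function of the base-pair stability s; a smaller σ only makes the crossover steeper.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K)


def stability(temp_k, tm_k=350.0, dh_kcal=8.0):
    """Statistical weight s of a closed base pair relative to an open one.

    Illustrative two-state form with s = 1 at tm_k; tm_k and dh_kcal are
    made-up values, not fitted parameters.
    """
    return math.exp((dh_kcal / R_KCAL) * (1.0 / temp_k - 1.0 / tm_k))


def largest_eigenvalue(s, sigma):
    """Largest eigenvalue of the transfer matrix [[s, 1], [sigma * s, 1]]."""
    trace = s + 1.0
    det = s * (1.0 - sigma)
    return 0.5 * (trace + math.sqrt(trace * trace - 4.0 * det))


def closed_fraction(s, sigma, eps=1e-6):
    """Fraction of closed base pairs, d ln(lambda) / d ln(s), by central difference."""
    up = math.log(largest_eigenvalue(s * math.exp(eps), sigma))
    down = math.log(largest_eigenvalue(s * math.exp(-eps), sigma))
    return (up - down) / (2.0 * eps)


if __name__ == "__main__":
    for temp in (340.0, 345.0, 350.0, 355.0, 360.0):
        s = stability(temp)
        print(temp, round(closed_fraction(s, 1e-3), 4))
```

In this toy model the midpoint s = 1 always gives a closed fraction of one half, and no choice of σ > 0 produces a discontinuity, which is the sense in which the loop entropy term is needed for a genuine transition.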

On rather short DNA sequences (~ 10^2 bps) the loop
entropy contribution is not very important as loops are
rare and rather short and the DNA denaturates mainly
through unbinding from the edges. A description based on
the 1d Ising model with appropriate experimentally determined
energy parameters is therefore sufficient (see,
e.g., Ref. lis]). A calculation aimed at reproducing ex- 
perimental melting curves with several peaks of different 
heights and widths [such as that shown in Fig. 1(b)], however,
needs, besides accurate energy parameters, also a good 
estimate of the entropy of the denaturated loops. 

A broadly known method to calculate DNA melting
curves for a given input sequence is the Poland algorithm,
in which the probability for each base pair to be in
an open or closed state can be calculated from a set of
recursive relations. The entropy of the denaturated loops
is given by counting the number of configurations for a
closed self-avoiding walk. This approach overestimates
the actual entropy as it does not take into account
that the number of configurations available to the loop is
limited by the presence of the rest of the chain. Recent
advances in the statistical mechanics of polymers provided
new insight on how to calculate entropies for loops
which are embedded in chains.

The aim of this paper is to discuss whether DNA ther- 
mal denaturation experiments are able to fix unequivo- 
cally the form of the parameters involved in the entropy 
of denaturated loops. We analyze the effect of the im- 
proved loop entropy estimate on the melting curves for 
DNA sequences of various lengths (up to 5 x 10^6 base
pairs) and compositions. We show that the modification 
of the entropic parameters associated with the loops may 
have quite a strong effect on the melting curves, espe- 
cially for sequences of intermediate lengths where several 
peaks are present [as in the example of Fig. 1(b)]. We
also discuss how the effects described here can be best 
measured experimentally. 

This paper is organized as follows: In Sec. II we review
some recent results on the entropy of loops embedded into
a chain. In Sec. III we present some melting curves for
sequences of various lengths and investigate the effect of
a modification of the parameters associated with the loop
entropy on these curves. In Sec. IV we discuss how the
double helix rigidity could influence the results for the
entropy, while Sec. V concludes the paper.



II. ENTROPY OF A LOOP EMBEDDED IN A 
CHAIN 

In the Poland algorithm the partition function of a loop
of total length l is given by the number of configurations
of a self-avoiding walk returning for the first time to the
origin after l steps [Fig. 2(a)], which in the limit
l → ∞ assumes the following form:

$$ Z_l \simeq \sigma \, \frac{\mu^{l}}{l^{c}} \qquad (1) $$

where μ is a nonuniversal geometric factor, while c, the
so-called loop closure exponent, is a universal quantity. In
Eq. (1) we also included a prefactor σ, which only makes
sense when comparing the loop partition function with
that of a double-stranded helix, and measures the absolute
probability of interrupting a double helix to open a loop.
This quantity is known as the cooperativity parameter.

A small value of σ suppresses loop formation so that
loops proliferate only at temperatures very close to the
melting point and are typically large in order to minimize
the effect of a small σ. The transition becomes highly
cooperative, in the sense that large portions of the chain
will tend to unbind simultaneously. On the contrary, for
σ ≈ 1, there is no extra big penalty for loop openings
and many small loops may form already well below the
melting point. The transition is less cooperative, and
peaks in the melting curves appear more rounded-off.
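The tendency of a small cooperativity parameter to favor few large loops over many small ones can be illustrated with a short sketch. It evaluates the asymptotic loop weight σ μ^l / l^c and compares one loop of length 2l with two separate loops of length l. The function names are hypothetical, and the comparison deliberately ignores the stacking weights of the helical segment that would separate the two loops, so it should be read as a heuristic only.

```python
def loop_weight(length, sigma=1.26e-5, mu=1.0, c=1.75):
    """Asymptotic loop weight sigma * mu**length / length**c."""
    if length < 1:
        raise ValueError("loop length must be at least 1")
    return sigma * mu ** length / length ** c


def merge_preference(length, sigma, c, mu=1.0):
    """Weight of one loop of size 2*length divided by that of two loops of size length.

    The factor mu cancels in this ratio, which equals length**c / (sigma * 2**c).
    Helix weights between the two loops are ignored (assumption of this sketch).
    """
    single = loop_weight(2 * length, sigma, mu, c)
    pair = loop_weight(length, sigma, mu, c) ** 2
    return single / pair


if __name__ == "__main__":
    for sigma in (1.0, 1e-3, 1.26e-5):
        print(sigma, merge_preference(10, sigma, c=2.15))
```

The ratio grows as 1/σ, so in this simplified picture each additional loop opening costs a factor σ and merging neighboring loops becomes increasingly favorable as σ decreases.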

For self-avoiding walks embedded in a three-dimensional
space the exponent c has been estimated numerically
to be c ≈ 1.75. Other loop parameters have to
be fixed by fitting to the available experimental data. In
particular, a lot of effort has been devoted to the measurement
of the cooperativity parameter σ. Its most accurate
determination was performed by Blake and Delcourt [19],
who analyzed the melting of several tandemly
repeated inserts on a linearized plasmid DNA and found
that the value σ = 1.26 x 10^-5 fits best the experimental
melting curves. This value is consistent with previous
estimates.

FIG. 2: The total number of configurations for a self-avoiding
loop is given by Eq. (1) with c ≈ 1.75 for an isolated loop (a)
and c ≈ 2.15 for a loop embedded in a chain (b).

Recent developments in the field of polymer physics
allow us to calculate the total number of configurations
for a loop embedded in a chain, such as that shown in
Fig. 2(b). Analytical estimates, relying on the general theory
of polymer networks, indicate that the form given in
Eq. (1) is still valid, however with a higher value for the
exponent c. The reason for the higher value of c is the
lower number of configurations available for the loop due
to the presence of the rest of the chain, compared to the
number of configurations available for an isolated loop.
Monte Carlo simulations on suitable three-dimensional
lattice models yield c ≈ 2.15, a value which is also
in agreement with analytical estimates, which place this
exponent in the range 2.10 ≲ c ≲ 2.20. It is also well
known that increasing the value of c has the effect
of sharpening the transition. A value c > 2 implies a
first-order transition in the case that the energy difference
between different base pairs is neglected; first-order
melting was indeed found numerically [25].
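The role of the threshold c = 2 can be made concrete with a small numerical check. In the homogeneous case, and assuming that at the transition the exponential factor of the loop weight is exactly compensated so that loop lengths are distributed as l^-c, the mean loop length is finite only if the sum of l^(1-c) converges. The sketch below (function name hypothetical) compares partial sums for the two exponents discussed in the text.

```python
def first_moment_partial_sum(c, l_max):
    """Partial sum of l * l**(-c) for l = 1 .. l_max (unnormalized mean loop length)."""
    return sum(l ** (1.0 - c) for l in range(1, l_max + 1))


def growth_ratio(c, l_small=1000, l_large=100000):
    """How much the partial sum grows when the cutoff is raised from l_small to l_large."""
    return first_moment_partial_sum(c, l_large) / first_moment_partial_sum(c, l_small)


if __name__ == "__main__":
    for c in (1.75, 2.15):
        print(c, growth_ratio(c))
```

For c = 1.75 the partial sum keeps growing as a power of the cutoff, so the typical loop size diverges at the transition, whereas for c = 2.15 it slowly saturates, which is consistent with loops remaining finite at a first-order transition.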

It is clear that accepting c = 2.15 as the most appropriate
estimate of the loop exponent entering in the partition
function (1) implies that the existing estimate of the
cooperativity parameter σ has to be revisited, as its experimental
determinations relied on the choice c = 1.75.
The aim of this paper is to investigate this issue in detail.
To this purpose we analyze the melting curves for
DNA sequences of a wide range of lengths up to the full
genome of E. coli, which amounts to about 5 x 10^6 bps.



III. CALCULATION OF MELTING CURVES 

The calculations of DNA melting curves are obtained
from our own C version of the Meltsim package, which is a
program based on the Poland recursive algorithm for the
calculation of the probabilities A(i) that the i-th base
pair is in an open state. The program uses the Fixman-Freire
method, which consists of approximating the power-law
factor l^-c of the loop partition function of Eq. (1) as a
sum of I exponentials:

$$ \frac{1}{l^{c}} \simeq \sum_{k=1}^{I} a_k \, e^{-b_k l} \qquad (2) $$

where a_k and b_k are fitting coefficients. As for a sequence
of total length N the longest possible loop corresponds
to l = 2N, the fitting is done in a limited interval of
lengths. Therefore, the shorter the sequence, the smaller
is the number of coefficients I necessary to obtain a very
good fit of Eq. (1) for the loop lengths relevant to the
problem. Typically a sum with I ≈ 10 coefficients is
sufficient for our needs. The advantage of the exponential
approximation is that it reduces the computational time
for a sequence of N base pairs from O(N^2) to O(N x I)
without any significant loss in accuracy.
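The origin of the speed-up can be sketched as follows. A sum over all previous positions weighted by a kernel that is itself a sum of I exponentials can be updated with I running accumulators instead of being recomputed from scratch at each position. The coefficients and function names below are made up for illustration and are not fitted values; the actual Poland recursion contains additional terms beyond this convolution.

```python
import math


def kernel(length, a, b):
    """Exponential-sum kernel f(l) = sum_k a_k * exp(-b_k * l)."""
    return sum(ak * math.exp(-bk * length) for ak, bk in zip(a, b))


def convolve_direct(x, a, b):
    """S_i = sum_{j < i} f(i - j) * x_j, computed in O(N^2 * I) operations."""
    return [sum(kernel(i - j, a, b) * x[j] for j in range(i)) for i in range(len(x))]


def convolve_recursive(x, a, b):
    """Same quantity in O(N * I) operations using one accumulator per exponential."""
    decay = [math.exp(-bk) for bk in b]
    running = [0.0] * len(a)  # running[k] = sum_{j < i} exp(-b_k * (i - j)) * x_j
    out = []
    for xi in x:
        out.append(sum(ak * rk for ak, rk in zip(a, running)))
        # Shift every accumulator by one position and absorb the new term x_i.
        running = [dk * (rk + xi) for dk, rk in zip(decay, running)]
    return out


if __name__ == "__main__":
    a = [0.5, 0.2, 0.05]   # illustrative amplitudes, not fitted
    b = [1.0, 0.1, 0.01]   # illustrative decay rates, not fitted
    x = [float((7 * i) % 5) for i in range(40)]
    direct = convolve_direct(x, a, b)
    fast = convolve_recursive(x, a, b)
    print(max(abs(u - v) for u, v in zip(direct, fast)))
```

Both routines return the same numbers up to rounding, but the cost of the second grows only linearly with the sequence length for a fixed number of exponentials.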

TABLE I: Domains from pN/MCSx from Ref. [19] considered
here as insertions into purely CG-domains.

Domain (x)   Length (N)   Composition
1            155          [ACTCGGACGA]_15 ACTCG
2            305          [ACTCGGACGA]_30 ACTCG
5            335          [ACTCGGACGA]_33 ACTCG
10           500          [AAGTTGAACAAAT]_38 AAGTTG
11           747          [AAGTTGAACAAAT]_57 AAGTTG
12           214          [AAGTTGAACAAAT]_16 AAGTTG
13           245          [AAGTTGAACAAAAT]_17 AAGTTGA
14           203          [AAGTTGAACAAAAT]_14 AAGTTGA
3            135          [AAGTTGAACAT]_13 AAGTTG
6            330          [AAGTTGAACAAT]_27 AAGTTG
15           292          [AGTGACAT]_36 AGTG

FIG. 3: Schematic view of the melting of the AT-rich insert
(gray) in the case of (a) edge insertion and (b) inner insertion
as performed in Ref. [19]. The difference in the melting
temperatures for the two cases, ΔT_m = T_m^loop - T_m^end, allows one to
estimate the cooperativity parameter.

For various temperatures we calculated the probability
A(i) that the i-th base pair is in an open state. From
numerical differentiation of the average:

$$ A = \frac{1}{N} \sum_{i=1}^{N} A(i) \qquad (3) $$

for a sequence of N base pairs, we obtained the melting
curves dA/dT vs. T.
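The last step, from per-base-pair opening probabilities to a differential melting curve, can be sketched in a few lines. The helper names are hypothetical and the finite-difference scheme (central differences in the interior, one-sided at the two ends) is an assumption; the text does not specify which scheme the program uses.

```python
def average_open_fraction(open_probs):
    """Average of the opening probabilities over the N base pairs of the sequence."""
    if not open_probs:
        raise ValueError("need at least one base pair")
    return sum(open_probs) / len(open_probs)


def differential_melting_curve(temps, fractions):
    """Finite-difference estimate of dA/dT on a (possibly non-uniform) temperature grid."""
    n = len(temps)
    if n != len(fractions) or n < 2:
        raise ValueError("need two lists of equal length with at least two points")
    deriv = []
    for i in range(n):
        lo = max(i - 1, 0)        # one-sided difference at the first point
        hi = min(i + 1, n - 1)    # one-sided difference at the last point
        deriv.append((fractions[hi] - fractions[lo]) / (temps[hi] - temps[lo]))
    return deriv


if __name__ == "__main__":
    temps = [78.0, 79.0, 80.0, 81.0, 82.0]
    # Made-up opening probabilities for a 3-bp toy sequence at each temperature.
    profiles = [[0.0, 0.0, 0.1], [0.1, 0.1, 0.2], [0.4, 0.5, 0.6],
                [0.8, 0.9, 0.9], [1.0, 1.0, 1.0]]
    fractions = [average_open_fraction(p) for p in profiles]
    print(differential_melting_curve(temps, fractions))
```

The temperature step has to be small compared with the peak widths of interest (about 0.5°C or less for intermediate sequences) for the finite difference to resolve them.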

In this work we have chosen the ten stacking energy 
parameters given in Ref. |^. Another parameter that 
needs to be specified in the calculation is the concen- 
tration of the monovalent salt in solution, which in our 
calculation varies between 0.05 M and 0.1 M. 



A. Melting of short sequences 

We first analyzed a series of existing experimental data
[19] for short sequences consisting of oligomeric repeat
patterns shown in Table I. These sequences were inserted
in a linearized recombinant pN/MCSx plasmid (x abbreviates
a specific sequence). The insertions are AT-rich
and melt at lower temperatures than the rest of the plasmid
chain, which plays the role of an energetic barrier
to prevent further melting. The insertions were placed
at one end, and in the middle of the linearized plasmid,
thus their melting occurs through end or loop openings,
respectively (see Fig. 3).

The experimental results, tabulated in Ref. [19], are reanalyzed
here using the exponent c = 2.15, appropriate
for a loop inserted in a chain. To induce both loop and
end openings we embedded the sequences of Table I in the
middle and at one end of a pure CGCGCG... chain. Due
to the higher stability of the CG-domains, their melting
will occur at much higher temperatures than those of the
inserted sequences.

FIG. 4: Calculated melting curves for Seq. 11 inserted in
the middle of a long CGCG... matrix, for 0.075 M of monovalent
salt concentration and three choices of c and σ:
c = 1.75, σ = 1.26 x 10^-5; c = 2.15, σ = 1.26 x 10^-5;
c = 2.15, σ = 1.75 x 10^-4.

FIG. 5: Plot of ΔT_m as function of the inverse domain size.
Empty circles are experimental data taken from Ref. [19] (Table
I); dashed and dotted lines refer to two choices of c and σ.

Figure 4 shows a plot of the differential melting curves
for Seq. 11 of Table I embedded in the middle of
a CG matrix, as calculated by our program. We first
fixed σ = 1.26 x 10^-5, the value reported in Ref. [19]. The
change of c from 1.75 to 2.15 causes a shift of the position
of the melting curve peak of about 0.2°C and a
slight increase of height (the peak position defines the
loop melting temperature T_m^loop). By choosing c = 2.15,
σ = 1.75 x 10^-4 one recovers a melting curve which almost
perfectly overlaps the original one obtained with
c = 1.75, σ = 1.26 x 10^-5 (solid and dashed lines of Fig.
4). This curve fits well the experimental data (see Ref.
[19]); thus we conclude that the choice c = 2.15 and
σ = 1.75 x 10^-4 yields a melting curve consistent with
the experimental values.

In order to provide an estimate of σ from several independent
measurements we reanalyzed the procedure
followed in Ref. [19]. Figure 5 shows experimental data
(empty circles) for the temperature difference of the sequences
of Table I, ΔT_m = T_m^loop - T_m^end, plotted as a function
of the inverse domain length 1/N (data taken from
Table I of Ref. [19]). Here T_m^loop and T_m^end denote the location
of the maxima of dA/dT for sequences inserted in
the interior and at the end of the plasmid chain. The
dashed line shows the calculated ΔT_m in the case of
c = 1.75, σ = 1.26 x 10^-5, where the latter value was
determined using a regression analysis to fit the experimental
data. We have repeated this procedure here fixing
c = 2.15, for which we find an equally good fit of the data
with the choice σ = 1.25 x 10^-4. Given the precision of
the experimental data, which we could not assess or analyze
further, we find that our calculated curve matches
the data very well. Deviations between both theoretical
curves clearly appear for shorter chains (loops), where the
application of the asymptotic form of the loop partition
function of Eq. (1) may not be appropriate.
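A rough consistency check on the size of this rescaling, which is not part of the regression analysis described above, follows directly from the power-law form of the loop weight: for a loop of fixed length l, raising the exponent from 1.75 to 2.15 lowers the weight by l^-0.4, so keeping the weight unchanged requires multiplying σ by l^0.4. The sketch below (function name hypothetical) evaluates this factor for a few loop lengths; which loop length is representative for a given insert is left open here.

```python
def sigma_rescaling_factor(loop_length, c_old=1.75, c_new=2.15):
    """Factor by which sigma must grow so that sigma / l**c is unchanged at one loop length."""
    if loop_length < 1:
        raise ValueError("loop length must be at least 1")
    return float(loop_length) ** (c_new - c_old)


if __name__ == "__main__":
    for length in (100, 300, 1000):
        print(length, sigma_rescaling_factor(length))
```

For loops of a few hundred units the factor is close to ten, in line with the one order of magnitude found from the fits; since the factor depends on l, a single rescaled σ cannot compensate the change of c exactly for all loop sizes.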



B. Melting of sequences of intermediate length 

We next consider the melting of two sequences of intermediate
length. We used two protein-coding cancer-related
genes, eIF-4G (2900 bps) and LAMC1 (7900 bps),
selected from a series of other sequences analyzed for
their predominant loop melting effects.

Figure 6(a) shows the melting curves for three different
values of c and σ for a fragment of DNA of eIF-4G as
calculated from our program. Four major distinct peaks,
labeled from 1 to 4, are visible. The denaturation maps,
i.e. plots of 1 - A(i) as function of i, the base position
along the chain, which are shown in Fig. 6(b), provide
further insight on the type of transitions associated with
each peak. We recall that 1 - A(i) is the average probability
that the i-th base pair is in a closed state. The
six plots of Fig. 6(b), labeled as α, β, ..., ζ, correspond
to the six temperatures marked by the vertical arrows in
Fig. 6(a).

Comparing the configurations at the temperatures just
below and above peak 3 (γ and δ) we notice that
this peak corresponds to the opening of a big loop extending
roughly from base pair 1300 to 2300. The melting
curves were calculated for three different sets of parameters,
starting from the standard choice c = 1.75,
the value associated with isolated loops, together with
σ = 1.26 x 10^-5, which is the most recent estimate for
the cooperativity parameter [19]. While keeping σ fixed
we first consider the exponent for a loop embedded in
the chain, c = 2.15. For such a combination of parameters
the melting peaks 1 and 2 are shifted from their positions,
while peak 4 vanishes in the signal background. The main
influence of a change in c is on peak 3, which for c = 2.15
becomes roughly twice as high (dA/dT|_max ≈ 1.1 compared
to dA/dT|_max ≈ 0.6 in the case c = 1.75). The
reason for this is that, as shown from the study of the
denaturation maps, peak 3 corresponds to the opening
of a long loop of about 1000 bps, thus any modification
of the parameters entering in the loop entropy will
have a particularly strong effect on it. Increasing c, while
keeping σ fixed, makes the transition sharper, as expected.

FIG. 6: (a) Melting curves for the sequence eIF-4G for three choices of c and σ and 0.05 M of monovalent salt. (b) Denaturation
maps for c = 2.15, σ = 1.26 x 10^-5 calculated at the six different temperatures (labeled by α, β, ...) indicated by vertical
arrows in (a).

FIG. 7: (a) Plot of the melting curves for the DNA sequence of LAMC1 for three choices of the parameters c and σ. (b)
Denaturation maps for c = 2.15, σ = 1.26 x 10^-4 at the six different temperatures marked in (a) by the vertical arrows. Inset:
Blow up of the denaturation maps β and γ for c = 2.15, σ = 1.26 x 10^-4 (thick solid lines) and c = 2.15, σ = 1.26 x 10^-5 (thin
solid lines).

The third melting curve we plotted is obtained again
with c = 2.15, but with a cooperativity parameter which
is one order of magnitude larger: σ = 1.26 x 10^-4. This
curve overlaps quite well with the original one. The region
where the first and third curves differ the most is
around peaks 3 and 4, where the chain contains a 1000
bps long loop, and is shown in the inset of Fig. 6(a).
Even in this region the differences are not very big: e.g.,
the shift of the maximum for peak 3 is ΔT_max ≈ 0.1°C,
while both peaks keep the same heights.

We repeated similar calculations for other sequences in
the same range of lengths. Figure 7(a) shows the melting
curves for a fragment of LAMC1 about 7900 bps long, for
the same choice of parameters as in Fig. 6. Compared to
the previous example there is a larger number of peaks,
as the sequence is more than twice as long as the previous
one and more subtransitions take place. The most
relevant difference between the melting curves calculated
for different values of c and σ is within the region around
81°C, where a melting peak doubles its height going from
the original choice of c = 1.75 to c = 2.15, while keeping
the cooperativity parameter fixed at σ = 1.26 x 10^-5. As
in the eIF-4G sequence, when σ is rescaled by a factor
10 we obtain a melting curve running extremely close to
the original one in the whole range of temperatures.

It is interesting to take a closer look at the average configurations
around the peak at T = 80.5°C. The inset
of Fig. 7(b) shows a blow up of the denaturation maps
β and γ in a region around base pairs i = 3500 - 4000,
where a loop opening occurs. While keeping c = 2.15 we
plot in the inset the maps both for σ = 1.26 x 10^-5 (thin
lines) and σ = 1.26 x 10^-4 (thick lines). It is remarkable
how little thick and thin lines differ, compared to
the big effect of a change of σ in the differential melting
curves of Fig. 7(a). The "robustness" of the denaturation
maps with respect to a change of the thermodynamic
parameters has also been observed recently by Yeramian.
The curves in the inset clearly demonstrate the increase
of cooperativity when σ is decreased: either loops
are suppressed (case β) or two neighboring loops tend to
merge (case γ) in order to minimize the effect of a small
value of σ. Although the peak at 83°C corresponds to
the formation of a loop of about 2000 bps, as can be
seen from the denaturation maps δ and ε, a change in
the loop parameters c and σ does not seem to modify
this peak much, as the transition considered is not an
opening of a double helical segment, but rather a merging
of two loops with a corresponding enlargement. We
conclude that loop-merging transitions are less affected
by a change in c and σ as compared to a genuine loop
opening.

We notice also that changing the loop parameters does
not affect at all the small peak at T ≈ 89°C, because
this corresponds to a transition not involving inner loops,
but rather the melting of the region 0 < i < 900 through
opening from the edges [see Fig. 7(b)].



C. Melting of long sequences (E. coli)

In order to further assess the validity of the previous
analysis we calculated melting curves for much longer
DNA sequences (~ 10^6 bps). Longer sequences have the
advantage that they develop loops of a broad range of
length scales, thus one may test the influence of a change
of loop parameters simultaneously for very short and very
long loops. Unfortunately a limiting factor in this case
is that, as pointed out in the introduction, the melting
curves are rather featureless [as for Fig. 1(c)].

FIG. 8: Melting curves for the whole DNA of E. coli (≈
4,500,000 bps).

Figure 8 shows the calculated melting curves for the full
DNA of E. coli, which is about 4.6 x 10^6 bps long with a
known sequence. The curves are almost
continuous, although some irregular structure is
still visible, indicating that the discrete sharp peaks of
the underlying sequence have not been completely averaged
out. The melting curves extend now over about
15°C and have a typical asymmetric shape, with a gentle
growth in the low temperature region, and a steeper
descent above T_max, the temperature for which dA/dT is
maximal.

As before we calculate the melting curves for three different
combinations of the parameters c and σ. Again
we find that changing the loop closure exponent from
c = 1.75 to c = 2.15 while rescaling σ by one order
of magnitude brings the curve back to the original one.
The two curves obtained in this way are perfectly overlapping.
A change of c only, while leaving σ = 1.26 x 10^-5,
makes the melting curve somewhat sharper, as observed
previously for shorter sequences. However the increase of
the peak height is limited to about 10%, a rather small
effect compared to the doubling of the height found for
some peaks of Figs. 6 and 7.

It is interesting to notice that by changing the loop
parameters c and σ the part of the melting curves for
T < T_max apparently is not modified much, as all three
curves plotted in Fig. 8 run with good accuracy on top of
each other in this temperature interval. It is only at
T ≈ T_max and above that the curves start separating, as
shown in the inset. We also point out that the comparison
of experimental and calculated melting curves for E.
coli reported in the literature shows some differences in the
high temperature region T > T_max. These differences could be due
to some non-equilibrium effects, which are known to be
more severe for very long sequences.



IV. EFFECT OF THE DOUBLE HELIX 
STIFFNESS 



There has been some discussion recently about the influence
of the stiffness of the double helix on the exponent
c. In the very ideal limit of a loop attached to two
infinitely rigid rods, the appropriate value of the loop exponent
is the same as that of an isolated loop, c = 1.75, as
the loop does not interact with the rest of the chain. The
dsDNA has a persistence length which is typically much
larger than that of the ssDNA (ξ_ds ≫ ξ_ss), therefore it
is legitimate to question the applicability of the polymer
network theory [23], from which a higher loop exponent
c ≈ 2.1 was derived, to real DNA sequences.

A different persistence length between dsDNA and ssDNA
was incorporated in lattice Monte Carlo simulations.
In these calculations the exponent c, estimated from
an analysis of the distributions of the loop lengths using
realistic values for ξ_ds and ξ_ss, was found to be consistent
with that of the case in which the difference between
persistence lengths is neglected. However these lattice
models did not incorporate a cooperativity parameter.

In Ref. [28] it was estimated that sequences up to
5000 bps are still too short to show any interaction effects
between a loop and the rest of the chain, so that
the appropriate value of the loop exponent should be that
of an isolated loop, c ≈ 1.75. There is however still disagreement
about this conclusion [29].

Here we would like to point out some effects which
have been overlooked in the present literature. The main
interest in the exponent c is that of modeling the DNA
melting curves, and the melting process typically takes
place at around 80-90°C (see Figs. 6, 7 and 8). Therefore,
in order to discuss the applicability of the higher
loop exponent c ≈ 2.1 to real DNA sequences, it is the
difference in persistence length of dsDNA and ssDNA
in this range of temperatures which should be investigated.
At room temperature the persistence lengths are
ξ_ds(T = 20°C) ≈ 500 Å, while ξ_ss(T = 20°C) ≈ 40 Å,
corresponding to roughly 100 bps and 8 bps, respectively
[30]. Both conformational fluctuations and electrostatic
interactions contribute to the persistence length of dsDNA
in solution. In the limit of high salt concentrations
electrostatic interactions are totally screened and
the persistence length is expected to scale as a function of
the temperature as in the classical wormlike chain model,
i.e. ξ = κ/T, with κ temperature independent. This
formula implies a reduction of about 20% of the dsDNA
persistence length in the melting region compared to the
room temperature value. However the assumption that
the electrostatic interactions are totally screened may not
be fully justified for the range of salt concentrations used
here ([Na] = 0.1 M or lower). Thus the 20% should
rather be considered as an upper bound.
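The quoted reduction follows from simple arithmetic with absolute temperatures, as the following sketch shows. It assumes only the wormlike-chain scaling ξ = κ/T stated above and the room-temperature value of 500 Å; the function name is hypothetical.

```python
KELVIN_OFFSET = 273.15


def wlc_persistence_length(xi_ref, t_ref_celsius, t_celsius):
    """Persistence length at t_celsius for xi = kappa / T with kappa temperature independent."""
    t_ref = t_ref_celsius + KELVIN_OFFSET
    t = t_celsius + KELVIN_OFFSET
    return xi_ref * t_ref / t


if __name__ == "__main__":
    xi_room = 500.0  # dsDNA persistence length at 20 C, in angstrom
    for t_melt in (80.0, 85.0, 90.0):
        xi = wlc_persistence_length(xi_room, 20.0, t_melt)
        print(t_melt, xi, 1.0 - xi / xi_room)
```

Between 80°C and 90°C the relative reduction lies between roughly 17% and 19%, i.e. about 20% as stated, which is far too small on its own to make the double helix flexible on the scale of the loops considered here.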

A more relevant effect for the reduction of the dsDNA
persistence length is the proliferation of small denaturated
bubbles within a dsDNA segment. Within the
model considered here the characteristic lengths close to
melting, such as the loop and double helix segment lengths,
scale as 1/σ. As pointed out before, a large cooperativity
(small σ) would imply typically long loops and long
double helical segments. However, the conclusion that
short (i.e. ~ 10 bps) loops would be totally suppressed
due to the small σ is too simplistic. Several studies
showed that the nearest neighbor model largely underestimates
the opening probability for small loops (see the
discussion in [1] and references therein), so it cannot be
a quantitatively reliable tool at too short length scales.
Such short loops are present in real DNA samples.

We can demonstrate this effect explicitly by analyzing
the experimental melting curves recently obtained [33] in
a study of dsDNA bubble dynamics. This study was performed
on short sequences (of about 30 bps) containing
an internal AT region surrounded on both sides by short
CG clamps. Experiments show that the inner AT region
melts at temperatures at which the CG edges are still
bound [33].

FIG. 9: Probability of finding the base pairs i = 1 and i = 17 in an open
state plotted as function of the temperature for the sequence
M18 of Ref. [33] for σ = 1.26 x 10^-5 (a) and for a larger value
of σ (b).

In Fig. 9 we present the calculated melting curves
for the probabilities of having the base pairs i = 1 and
i = 17 in an open state for the sequence M18 of Ref.
[33], which can be directly compared with the experimental
results as both quantities A(1) and A(17) have been
measured experimentally through fluorescence measurements
[33] (the base pair i = 17 is in the middle of the
AT-region). As the sequence is very short the calculated
melting curves are not sensitive to a change in c. For
σ = 1.26 x 10^-5 [Fig. 9(a)], the typical value for the
cooperativity parameter used previously in the paper, our
calculations show that A(1) and A(17) are at all temperatures
very close to each other, which implies that no
loops are formed and that the sequence rather melts from
edge openings, although the CG edges are energetically
more stable than the inner AT region. This is an effect
of the small value of σ used in the calculation. In order
to verify this we have plotted in Fig. 9(b), just as an
illustration, the same quantities with cr = 5 x 10^^. The 
reduced cooperativity allows for the formation of loops
and now produces results closer to what is observed in
experiments (see Fig. 2 of Ref. [33]). Thus, although
the small cooperativity parameter (σ ~ 10^-5) correctly
describes long loop (> 100 bps) formation in DNA melting,
small loop (~ 10 bps) openings in AT-rich domains
are still possible. The importance of small bubble formation
has been recently emphasized in an analysis of the
melting of DNA oligomers.

As the ssDNA has, at 90°C, an estimated persistence
length corresponding to roughly 5 bps, we expect that
the opening of a small loop of about 15 bps would be
sufficient to decorrelate completely the two dsDNA segments
at its two sides. To provide an estimate of the
persistence length of the dsDNA one would need to know
the average density of small loops and their probabilities,
which depend on the sequence composition. It is however
conceivable that this effect would make the dsDNA
at 80-90°C much more flexible, compared to its room
temperature behavior, so that the use of the embedded loop
exponent for c is justified.

V. CONCLUSION 

In this paper we have analyzed the effect of
reparametrizing the loop weight contribution in the calculation
of DNA melting curves for sequences of a broad
range of lengths, up to the full genome of E. coli
(4.6 x 10^6 bps). Using the closure exponent for a loop
embedded in a chain (c = 2.15) we found that in order
to reproduce correctly the existing experimental data and
melting curves one needs to increase the cooperativity parameter
σ by about one order of magnitude. An increase
of the cooperativity parameter, which is the weight associated
with the interruption of a helix to form a loop,
implies that loops are more probable within the chain.

It is clear that a simultaneous change of the loop exponent
c and of σ cannot reproduce two perfectly overlapping
melting curves. We found that rescaling σ by
about one order of magnitude together with a change in
c from 1.75 to 2.15 produces typically very small shifts
in peak positions (~ 0.1°C) and heights. Accurate melting
experiments would be able to distinguish between the two
choices and fix unequivocally both c and σ. In any case
our analysis indicates that the best samples on which to test
the above effects are sequences of intermediate lengths
(≈ 10^3 - 10^4 bps). We showed that in these sequences
rather large loops may be formed and that the associated
melting peaks are extremely sensitive to a change
in loop parameters c and σ. The disadvantage of shorter
sequences is that they predominantly melt through end
openings (unless they are designed to do otherwise, as
in the example of the preceding section). The melting
curves of very long sequences, as we showed for E. coli,
are only weakly affected by a change in the parameters c
and σ.